
9 research outputs found

Theoretical and practical research into information retrieval models, Web metasearch (fusion), Hungarian test collection, cross-language retrieval

Author: Dominich Sándor
Horváth Sz. Mária
Skrop Adrienn
Publication venue: OTKA
Publication date: 01/01/2007

A unified formal framework was given for link-based Web retrieval methods. New connections were identified between information retrieval and information theory, number theory, language technology, medicine, computational complexity theory, and logic. The average effectiveness of the associative retrieval method was shown to be 0.6. A method was given for measuring the retrieval effectiveness of Web search engines. An entropy-based index term selection procedure was given, and it was shown that it can increase the effectiveness of the vector space retrieval method. The i2rMeta and NeuRadIR retrieval systems were developed. An English-language medical test database was developed and used to measure the effectiveness of the NeuRadIR system. Six Hungarian-language test databases were developed and used to study the small-world phenomenon and the associative method. The results have become part of the curriculum at the Faculty of Information Technology of Pannon University (in B.Sc. and Ph.D. programmes); the corresponding lecture notes are freely available to students and to anyone else interested. Beyond Pannon University, the results are also taught at Joint Advanced Student School München, Germany; University of Colorado at Denver, USA; and Eidgenössische Technische Hochschule Zürich, Switzerland.

Repository of the Academy's Library

Semantic distillation: a method for clustering objects by their contextual specificity

Author: AN Langville
Chris Godsil and Gordon Royle
CJ Rijsbergen van
DM Cvetković
F Fouss
I Yanai
J Mercer
J Shi
JC Bezdek
K Pearson
LA Zadeh
M Belkin
M Campanino
Miklós Rédei
MLD Chiara
MW Berry
N Aronszajn
P Baldi
P Gärdenfors
R Baeza-Yates
R Fan
R Homayouni
RR Coifman
S Vishveshwara
ST Wang
Sándor Dominich
Publication date: 01/01/2007

Techniques for data mining, latent semantic analysis, contextual search of databases, etc. were developed long ago by computer scientists working on information retrieval (IR). Experimental scientists from all disciplines, who have to analyse large collections of raw experimental data (astronomical, physical, biological, etc.), have developed powerful methods for statistical analysis and for clustering, categorising, and classifying objects. Finally, physicists have developed a theory of quantum measurement, unifying the logical, algebraic, and probabilistic aspects of queries into a single formalism. The purpose of this paper is twofold: first, to show that, when formulated at an abstract level, problems from IR, from statistical data analysis, and from physical measurement theories are very similar and hence can profitably be cross-fertilised; and second, to propose a novel method of fuzzy hierarchical clustering, termed semantic distillation and strongly inspired by the theory of quantum measurement, which we developed to analyse raw data coming from various types of experiments on DNA arrays. We illustrate the method by analysing DNA array experiments and clustering the genes of the array according to their specificity. Comment: Accepted for publication in Studies in Computational Intelligence, Springer-Verla

arXiv.org e-Print Archive

CiteSeerX

Crossref

HAL-Rennes 1

Mathematical foundations of information retrieval

Author: Dominich Sándor
Publication venue: Springer
Publication date: 01/01/2001

CERN Document Server

Mathematical Foundations of Information Retrieval

Author: Sándor Dominich
Publication venue: 'MIT Press - Journals'

Crossref

Computational Aspects of Connectionist Interaction Information Retrieval

Author: Sándor Dominich
Zsolt Tuza

Connectionism is a soft computing technique that aims to enhance retrieval effectiveness and is, at the same time, computationally very demanding. In IR, the computational complexity of retrieval algorithms has only recently become a research issue, although its practical importance has long been recognized. The paper presents a methodical study of the computational complexity of a connectionist retrieval algorithm, the Associative Interaction retrieval method. After a short description of the method itself, the complexities of weight computation and of "winner-takes-all"-based activation spreading (i.e., retrieval) are established. This is followed by an empirical estimate of the probability of having multiple maxima, and by an asymptotic estimate of the probability of having a unique maximum.

CiteSeerX

ACM SIGIR workshop on mathematical/formal methods in information retrieval MF/IR 2005

Author: Iadh Ounis
Jian-Yun Nie
Sándor Dominich
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref